DATAX121-23A (HAM) & (SEC) - Introduction to Statistical Methods
Counter-intuitively, we often check the third and fourth assumptions after fitting a one-way ANOVA model to the data
The R function, lm(), is “saying” the following equation, the means-only model, is appropriate for the data
\[ \begin{aligned} y_{ij} = \mu_i + \varepsilon_{ij}, &~ \text{where} ~ \varepsilon_{ij} \sim \text{Normal}(0, \sigma_\varepsilon) ~ \text{and} \\ &~ \text{we have up to} ~ k ~ \text{different values of} ~ i \end{aligned} \]
Fitting the model first gives us the residuals, \(\varepsilon_{ij}\), which are what is left over after accounting for the group means. It is these residuals that we want to have a similar spread in every group, and that we want to be Normally distributed
\[ \begin{aligned} y_{ij} = \mu_i + \varepsilon_{ij}, &~ \text{where} ~ \varepsilon_{ij} \sim \text{Normal}(0, \sigma_\varepsilon) ~ \text{and} \\ &~ \text{we have up to} ~ k ~ \text{different values of} ~ i \end{aligned} \]
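where \(y_{ij}\) is the \(j\)th observation in group \(i\), \(\mu_i\) is the population mean of group \(i\), and \(\varepsilon_{ij}\) is the residual for that observation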
The best estimates of the \(\mu_i\) are the group sample means, \(\bar{x}_i\), and the best estimate of \(\sigma_\varepsilon\) is \(\sqrt{MSR}\), the square root of the residual mean square
So for CS 2.2 the equation of the fitted model is:
\(\text{GillRate}_{ij} = \mu_i + \varepsilon_{ij}\), where \(\varepsilon_{ij} \sim \text{Normal}(0, \sigma_\varepsilon)\) and \(i\) is one of Low, Medium, or High
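As a minimal sketch of these estimates, the code below fits the means-only model to simulated data and checks that the estimated group means are the group sample means, and that the residual standard error reported by R is \(\sqrt{MSR}\). The data frame `sim.df`, its variables, and the object `sim.fit` are hypothetical stand-ins; this is not the CS 2.2 data.

```r
# Hypothetical data: 3 groups of 6 observations (NOT the CS 2.2 data)
set.seed(121)
sim.df <- data.frame(
  Group = factor(rep(c("Low", "Medium", "High"), each = 6)),
  y     = rnorm(18, mean = rep(c(68, 59, 58), each = 6), sd = 5)
)

# Fit the one-way ANOVA (means-only) model
sim.fit <- lm(y ~ Group, data = sim.df)

# Estimates of the mu_i: one estimated mean per group
new.groups <- data.frame(
  Group = factor(levels(sim.df$Group), levels = levels(sim.df$Group))
)
est.means <- predict(sim.fit, newdata = new.groups)

# The group sample means, for comparison
sample.means <- tapply(sim.df$y, sim.df$Group, mean)
est.means
sample.means

# Estimate of sigma: the square root of the residual mean square
msr <- anova(sim.fit)["Residuals", "Mean Sq"]
sqrt(msr)
sigma(sim.fit)  # the same value, reported by R as the residual standard error
```

The two sets of means should agree, as should the two estimates of \(\sigma_\varepsilon\).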
The ratio of the largest group sample standard deviation to the smallest group sample standard deviation is less than 2. That is, \(\max(s_i) / \min(s_i) < 2\), where \(s_i\) is the sample standard deviation of group \(i\)
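One way this rule of thumb might be checked in R is sketched below. The data are simulated and the names `sim.df`, `group.sds`, and `sd.ratio` are hypothetical; with real data, replace `sim.df` with the data frame being analysed.

```r
# Hypothetical data: 3 groups of 6 observations (NOT the CS 2.2 data)
set.seed(121)
sim.df <- data.frame(
  Group = factor(rep(c("Low", "Medium", "High"), each = 6)),
  y     = rnorm(18, mean = rep(c(68, 59, 58), each = 6), sd = 5)
)

# Sample standard deviation of each group
group.sds <- tapply(sim.df$y, sim.df$Group, sd)
group.sds

# Ratio of the largest to the smallest sample standard deviation
sd.ratio <- max(group.sds) / min(group.sds)
sd.ratio

# TRUE if the rule of thumb is satisfied
sd.ratio < 2
```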
This diagnostic plot is a scatter plot known as the Residuals versus Fitted plot
If the one-way ANOVA (means-only) model is appropriate for the data:
If the fitted model is a one-way ANOVA, an observation’s fitted value is equal to the sample mean of the group it belongs to
So for CS 2.2 we should expect the fitted values to correspond to 58.17 bpm, 68.5 bpm, and 58.67 bpm for fish in tanks with high, low, and medium calcium levels, respectively
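The link between fitted values and group sample means can be sketched as follows, again with simulated data rather than the CS 2.2 data. The names `sim.df` and `sim.fit` are hypothetical. The last line draws the Residuals versus Fitted plot, which for a one-way ANOVA shows one vertical strip of residuals per group mean.

```r
# Hypothetical data: 3 groups of 6 observations (NOT the CS 2.2 data)
set.seed(121)
sim.df <- data.frame(
  Group = factor(rep(c("Low", "Medium", "High"), each = 6)),
  y     = rnorm(18, mean = rep(c(68, 59, 58), each = 6), sd = 5)
)
sim.fit <- lm(y ~ Group, data = sim.df)

# Each observation's fitted value next to the sample mean of its own group
group.mean <- ave(sim.df$y, sim.df$Group)
head(cbind(fitted = fitted(sim.fit), group.mean = group.mean))

# Only as many distinct fitted values as there are groups
unique(round(fitted(sim.fit), 6))

# Residuals versus Fitted plot
plot(sim.fit, which = 1)
```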
As per Slides 3–4, it is the residuals that need to be Normally distributed
We should only check this assumption if the third assumption has been satisfied
If the one-way ANOVA (means-only) model is appropriate for the data:
So for CS 2.2 we may trust neither the inference made with the F-test nor the multiple comparisons
This diagnostic plot is a scatter plot known as the Normal Q-Q plot
For any linear model that assumes the residuals are Normally distributed: The values of observed residuals (y-axis) agree with their theoretical values (x-axis)
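A sketch of how the Normal Q-Q plot might be produced in R is given below, using simulated data and the hypothetical names `sim.df` and `sim.fit`. Note that `plot()` on a fitted `lm` object draws standardised residuals, whereas the by-hand version here uses the raw residuals; the reading of the plot is the same.

```r
# Hypothetical data: 3 groups of 6 observations (NOT the CS 2.2 data)
set.seed(121)
sim.df <- data.frame(
  Group = factor(rep(c("Low", "Medium", "High"), each = 6)),
  y     = rnorm(18, mean = rep(c(68, 59, 58), each = 6), sd = 5)
)
sim.fit <- lm(y ~ Group, data = sim.df)

# Normal Q-Q plot from the fitted model (standardised residuals)
plot(sim.fit, which = 2)

# The same idea by hand: observed residuals (y-axis) against
# their theoretical Normal quantiles (x-axis), with a reference line
sim.res <- resid(sim.fit)
qqnorm(sim.res)
qqline(sim.res, lty = 2)
```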
It was of interest to see whether the average height of choir singers differs by voice part. A random sample of 237 singers from a choral society was taken to collect data to answer this question.
| Variable | Description |
|---|---|
| Height | A number denoting a singer’s height (in cm) |
| Part | A factor denoting a singer’s voice part, either Soprano, Alto, Tenor, or Bass |
Describe any features of the data
The distribution of heights within each voice part is unimodal. (The extra mode for the Altos is an artefact of the histogram, as the height of its 170–175 cm bar is only two observations!)
Within voice part: The distributions of heights all seem relatively symmetrical, suggesting that each group’s sample mean is a good measure of centre
Notably, on average, the tenors and basses seem taller than the sopranos and altos. This makes sense once you consider men can only be tenors and basses, and women can only be sopranos and altos
The equation of the fitted model?
\(\text{Height}_{ij} = \mu_i + \varepsilon_{ij}\), where \(\varepsilon_{ij} \sim \text{Normal}(0, \sigma_\varepsilon)\) and \(i\) is one of Soprano, Alto, Tenor, or Bass
A random sample of singers from a choral society was taken. So, by chance, the sample proportion of each voice part should agree with each part’s corresponding unknown population proportion.
Since the first assumption was met because of random sampling, we also met this assumption.
The diagnostic plot to help assess this assumption, Residuals vs Fitted, suggests it is fine. Most of the “strange” residuals are from taller observations.
The diagnostic plot to help assess this assumption, Normal Q-Q, suggests that once we account for the differences between sample means, the residuals are plausibly Normally distributed as most residuals do not deviate from the dashed line
The hypothesis statements for the F-test in the context of E 9.1 are: \[ \begin{aligned} H_0\!: &~ \mu_S = \mu_A = \mu_T = \mu_B \\ H_1\!: &~ \text{At least one} ~ \mu_i \neq \mu_j \end{aligned} \]
Interpret the result of the F-test using a 5% significance level
We have very strong evidence against the null hypothesis that the population mean heights of Sopranos, Altos, Tenors, and Basses are the same, in favour of the alternative that there is at least one difference between the population mean heights of two of the four voice parts (p-value ≈ 0).
# The table that summarises how the "variability" of the
# heights is explained by the voice part and what's leftover
anova(choir.fit)

Analysis of Variance Table

Response: Height
           Df  Sum Sq Mean Sq F value    Pr(>F)
Part        3 12520.9  4173.6  100.97 < 2.2e-16 ***
Residuals 233  9631.5    41.3
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
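To see where the entries of this table come from, the sketch below recomputes the mean squares, the F value, and the p-value from the sums of squares and degrees of freedom printed above. The variable names are hypothetical, and because the inputs are the rounded values from the output, the results agree with the table only up to rounding.

```r
# Values copied from the ANOVA table above (rounded as printed)
ss.part <- 12520.9; df.part <- 3
ss.res  <- 9631.5;  df.res  <- 233

# Mean square = sum of squares / degrees of freedom
ms.part <- ss.part / df.part
ms.res  <- ss.res / df.res

# F value = mean square for Part / residual mean square
f.value <- ms.part / ms.res
f.value

# p-value: upper tail of the F distribution with (3, 233) degrees of freedom
p.value <- pf(f.value, df.part, df.res, lower.tail = FALSE)
p.value

# Estimate of sigma (in cm): square root of the residual mean square
sigma.hat <- sqrt(ms.res)
sigma.hat
```

The last quantity, roughly 6.4 cm, is the estimate of \(\sigma_\varepsilon\) for the choir model.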
Which pairwise comparisons between means are significant at the 5% level?
All of them according to the 95% confidence intervals and the adjusted p-values!
With 95% confidence, we estimate that:
# Load the emmeans R package
library(emmeans)
# Produce the pairwise confidence intervals and adjust
# them for multiple comparisons
emmeans(choir.fit, ~ Part) |>
  pairs(infer = TRUE)

 contrast        estimate   SE  df lower.CL upper.CL t.ratio p.value
 Alto - Bass       -13.92 1.13 233  -16.842   -11.00 -12.330  <.0001
 Alto - Soprano      3.25 1.14 233    0.296     6.20   2.847  0.0246
 Alto - Tenor      -10.20 1.28 233  -13.530    -6.88  -7.942  <.0001
 Bass - Soprano     17.17 1.12 233   14.284    20.06  15.396  <.0001
 Bass - Tenor        3.72 1.26 233    0.451     6.98   2.945  0.0186
 Soprano - Tenor   -13.45 1.27 233  -16.748   -10.16 -10.570  <.0001

Confidence level used: 0.95
Conf-level adjustment: tukey method for comparing a family of 4 estimates
P value adjustment: tukey method for comparing a family of 4 estimates
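The Tukey adjustment can be illustrated by reconstructing one interval and one adjusted p-value from the printed output using base R only. This is a sketch under the assumption that the adjustment uses the studentised range distribution (`qtukey()` and `ptukey()`) with 4 means and 233 residual degrees of freedom. The variable names are hypothetical, and the inputs are the rounded values shown above, so agreement is only approximate.

```r
# Values copied from the "Alto - Bass" row above (rounded as printed)
estimate <- -13.92
se       <- 1.13
df.res   <- 233
n.groups <- 4

# Tukey multiplier: studentised range quantile divided by sqrt(2)
crit <- qtukey(0.95, nmeans = n.groups, df = df.res) / sqrt(2)

# Tukey-adjusted 95% confidence interval for the difference in means
lower <- estimate - crit * se
upper <- estimate + crit * se
round(c(lower = lower, upper = upper), 2)

# For comparison: the unadjusted t multiplier is smaller,
# so an unadjusted interval would be narrower
qt(0.975, df = df.res)

# Tukey-adjusted p-value for the "Alto - Soprano" row (t.ratio = 2.847)
p.adj <- ptukey(abs(2.847) * sqrt(2), nmeans = n.groups, df = df.res,
                lower.tail = FALSE)
p.adj
```

The larger multiplier is the price paid for making all six comparisons at once while keeping the family-wise confidence level at 95%.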